FIGURE 3 (flowchart of the modelling process; box labels in order of appearance)

Visual signal path: Visual Signal; Extract head and hand position data; Compute velocity, acceleration; Classify and segment movements.

Audio signal path: Audio Signal; Extract pitch, voiceless intervals; Detect prominent intervals; Classify pitch accent.

Combined steps: Align movement features and prominent pitch segments; Create statistical model for each gesture for which meaning is known; End of modelling process.

Reference numerals shown: 302, 306, 320, 322, 324, 326, 330, 332, 334, 336, 338.
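
The "Compute velocity, acceleration" step of the flowchart could be approximated as in the minimal sketch below. It assumes the head or hand position is available as a list of (x, y) samples taken at a fixed interval dt and uses simple forward differences. The function name finite_differences, the sample track, and the choice of differencing scheme are illustrative assumptions and are not taken from the figure.

```python
def finite_differences(positions, dt):
    """Estimate velocity and acceleration from evenly sampled (x, y) positions.

    Assumption: forward differences over a fixed sampling interval dt (seconds).
    Returns one fewer velocity sample and two fewer acceleration samples
    than there are positions.
    """
    # Velocity: change in position between consecutive samples, divided by dt.
    velocities = [
        ((x1 - x0) / dt, (y1 - y0) / dt)
        for (x0, y0), (x1, y1) in zip(positions, positions[1:])
    ]
    # Acceleration: change in velocity between consecutive samples, divided by dt.
    accelerations = [
        ((vx1 - vx0) / dt, (vy1 - vy0) / dt)
        for (vx0, vy0), (vx1, vy1) in zip(velocities, velocities[1:])
    ]
    return velocities, accelerations


# Hypothetical hand track: four samples moving along the x axis, 0.5 s apart.
hand_track = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (6.0, 0.0)]
velocities, accelerations = finite_differences(hand_track, dt=0.5)
```

Forward differences are used here only for brevity; a real tracker would likely smooth the position data first, since differencing amplifies measurement noise.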



FIGURE 4 (flowchart; box labels in order of appearance)

Visual signal path: Visual Signal; Extract head and hand position data; Compute velocity, acceleration; Segment movements.

Audio signal path: Audio Signal; Extract pitch, voiceless intervals; Detect prominent intervals; Classify pitch accent.

Combined steps: Align movement features and prominent pitch segments; Compare with statistical model for each gesture; End.

Reference numerals shown: 402, 406, 420, 422, 424, 426, 430, 432, 434, 436, 438.



